A Parametric Network Approach for Concepts Hierarchy Generation in Text Corpus
Abstract
The article presents a preflow approach to the parametric maximum flow problem, derived from the rules for constructing a concept hierarchy in a text corpus. Just as generating a taxonomy can be equivalently reduced to ranking the concepts within a text corpus according to a defined criterion, the proposed preflow bipush-relabel algorithm computes the maximum flow, that is, the optimum flow that respects certain ranking constraints. The parametric preflow algorithm for generating a two-level concept hierarchy in a text corpus works in a parametric bipartite association network: at each step, the maximum possible amount of flow is pushed along conditional augmenting directed paths of two arcs in the parametric residual network, for the maximum interval of parameter values. The resulting parametric maximum flow generates concept hierarchies (taxonomies) in the text corpus for the different degrees of association described by the parameter values.
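To make the idea of a parametric bipartite association network concrete, the following minimal sketch evaluates the maximum flow at a few sampled parameter values. It is not the preflow bipush-relabel algorithm of the article: it uses a plain Edmonds-Karp augmenting-path method as a stand-in, and the network layout (source arcs with capacity equal to the parameter `lam`, term-to-concept arcs weighted by association, unit-capacity arcs into the sink), the function names, and the sample data are all assumptions made for illustration.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map (stand-in method)."""
    # Build residual capacities, adding reverse arcs with zero capacity.
    residual = {source: {}, sink: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0.0) + c
            residual[v].setdefault(u, 0.0)
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

def build_network(assoc, lam):
    """Hypothetical bipartite network: s -> term (capacity lam) -> concept -> t."""
    cap = {"s": {}}
    for (term, concept), weight in assoc.items():
        cap["s"]["term:" + term] = lam                    # parametric arc (assumed form)
        cap.setdefault("term:" + term, {})["concept:" + concept] = weight
        cap.setdefault("concept:" + concept, {})["t"] = 1.0
    return cap

# Made-up association weights between terms and candidate parent concepts.
assoc = {("flow", "graph"): 0.6, ("cut", "graph"): 0.5, ("topic", "text"): 0.8}

if __name__ == "__main__":
    for lam in (0.0, 0.25, 0.5, 1.0):
        value = max_flow(build_network(assoc, lam), "s", "t")
        print(f"lambda={lam:.2f}  max flow={value:.2f}")
```

Re-solving from scratch at each sampled value is the naive baseline; a parametric algorithm such as the one the abstract describes instead aims to cover whole intervals of parameter values in one pass.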
Similar resources
Situation and Text: Representation of Migrants Whilst the Escalation of Refugee Crisis in Great Britain as Compared to Russia
Increasing migration is a vital concern for a globalizing sociocultural environment in today’s world. The UK and developed European countries have become an attractive destination for asylum seekers (labelled as “migrants”) in the past decade. The rapid rise in the number of asylum seekers, which was labelled “migration crisis” (Ruz, 2015), made this topic an integral part of scientific discuss...
Learning Concept Hierarchies through Probabilistic Topic Modeling
With the advent of semantic web, various tools and techniques have been introduced for presenting and organizing knowledge. Concept hierarchies are one such technique which gained significant attention due to its usefulness in creating domain ontologies that are considered as an integral part of semantic web. Automated concept hierarchy learning algorithms focus on extracting relevant concepts ...
Graph-based Approach to Automatic Taxonomy Generation (GraBTax)
We propose a novel graph-based approach for constructing concept hierarchy from a large text corpus. Our algorithm, GraBTax, incorporates both statistical co-occurrences and lexical similarity in optimizing the structure of the taxonomy. To automatically generate topic-dependent taxonomies from a large text corpus, GraBTax first extracts topical terms and their relationships from the corpus. Th...
Towards The Web of Concepts: Extracting Concepts from Large Datasets
Concepts are sequences of words that represent real or imaginary entities or ideas that users are interested in. As a first step towards building a web of concepts that will form the backbone of the next generation of search technology, we develop a novel technique to extract concepts from large datasets. We approach the problem of concept extraction from corpora as a market-basket problem, ada...
A Korean-Japanese-Chinese Aligned Wordnet with Shared Semantic Hierarchy
A Korean-Japanese-Chinese aligned wordnet, “CoreNet” is introduced. For the purpose of this paper, the term “wordnet” refers to a network of words. It is constructed based on a shared semantic hierarchy that is originated from NTT Goidaikei (Lexical Hierarchical System). Korean wordnet was constructed through the semantic category assignment to every meaning of Korean words in a dictionary. Ver...